- Path: nntp.teleport.com!sschaem
- From: sschaem@teleport.com (Stephan Schaem)
- Newsgroups: comp.sys.amiga.programmer,comp.sys.amiga.games,alt.sys.amiga.demos,comp.sys.amiga.misc
- Subject: Re: AB3D II beats Quake....
- Followup-To: comp.sys.amiga.programmer,comp.sys.amiga.games,alt.sys.amiga.demos,comp.sys.amiga.misc
- Date: 28 Mar 1996 02:08:34 GMT
- Organization: Teleport - Portland's Public Access (503) 220-1016
- Message-ID: <4jcsb2$8te@nadine.teleport.com>
- References: <74000105753944194756@BIRDLAND> <10017.6659T1424T209@mbox.vol.it> <4jbcno$7m9@soleil.uvsq.fr>
- NNTP-Posting-Host: kelly.teleport.com
- X-Newsreader: TIN [version 1.2 PL2]
-
- Nicolas Pomarede (pomarede@isty-info.uvsq.fr) wrote:
- : >I see again a future for CISC: as the most expert of you know, RISC wasn't
- : >invented yesterday; it is as old as I am. The RISC vs CISC "war" has always
- : >been fought with RAM speed as the main weapon: when RAM was relatively slow,
- : >CISC was faster than RISC; when RAM technology was faster than CPU
- : >technology, RISC was faster than CISC. CPU technology has grown up a lot
-
- : I don't really agree with this one. Everything depends on the built-in cache
- : and the memory it uses. RISC was designed to provide a simpler
- : instruction set that should be easier and faster to decode.
- : Anyway, there's actually no RAM today that can keep up with a 68 MHz CPU
- : (at least not on a personal computer). 68 MHz means 14.7 ns RAM; I don't
- : think you can have many megs of this kind of RAM. This kind of RAM is only
- : used in 8/32 KB caches (or in VRAM), but in that case both RISC and CISC
- : can use this RAM (SRAM), so the RAM doesn't make a big difference IMO.
-
- I think L1 cache offers 0 wait states even at 200 MHz... and I also recall
- PCs having 512 KB of 8 ns L2 cache for $99.
- Larger systems like servers can sport 4 MB of cache.
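-
- The 14.7 ns figure quoted above is just the length of one clock cycle at
- 68 MHz. A minimal sketch of that arithmetic (the function name is made up
- for illustration, and it ignores real-world details such as address setup
- time and burst access):

```python
def cycle_time_ns(clock_mhz: float) -> float:
    """Return the length of one clock cycle in nanoseconds."""
    # 1 MHz = 1e6 cycles per second, so one cycle lasts 1000 / MHz nanoseconds.
    return 1000.0 / clock_mhz


if __name__ == "__main__":
    for mhz in (68, 200):
        print(f"{mhz} MHz: {cycle_time_ns(mhz):.1f} ns per cycle")
```

- By the same arithmetic, a zero-wait-state cache at 200 MHz has to respond
- within roughly one 5 ns cycle.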
-
- : >Advantages? 80x86 can fit up to 4 instruction codes into 32 bits, while the
- : >PowerPC can fit only 1. This means that parallelization will allow 80x86
- : >to run 4 times faster than the fastest PPC.
-
- : The problem is not the size of the opcode, it's the time needed to read it.
- : All modern CPUs read an opcode in one cycle, and since buses are 32 or 64 bits,
- : reading 8 bits is usually a waste of power.
-
-
- He was talking about cache fill. Usually 16 bytes are read at a time.
- For a CISC like the x86 this can equal 16 instructions, on a RISC
- 4... This would really matter if the CPU had no L1 or L2 cache.
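-
- The 16-vs-4 count is the best case for each side: it assumes every x86
- instruction in the line is 1 byte long and every RISC instruction is
- 4 bytes. A small sketch of that arithmetic (the names here are only
- illustrative):

```python
LINE_BYTES = 16  # bytes read per cache fill, as stated in the post


def instructions_per_fill(instr_bytes: int, line_bytes: int = LINE_BYTES) -> int:
    """Whole instructions of a given size that fit in one cache line fill."""
    return line_bytes // instr_bytes


if __name__ == "__main__":
    print("1-byte instructions per fill:", instructions_per_fill(1))
    print("4-byte instructions per fill:", instructions_per_fill(4))
```

- Real x86 code mixes instruction lengths, so the average per fill lands
- somewhere between those two extremes.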
-
- : One of the big problems of the Pentium and 680x0 is that the opcodes are not
- : all the same size. 680x0 can have opcodes of 2, 4, 8, 10 bytes and x86 can even
- : have opcodes of one byte (those compatible with 286).
-
- Yes, and it's a 'problem' that Intel, with work, turned to their advantage.
- The 680x0, for example, could have a RISC core that only implements the 16-bit
- instructions; the others would use interpreted code. You have to make
- the compiler aware of this so it optimizes your code using the best
- instructions... I think this is what happens with 16-bit code on the P6.
-
- : On the other hand, RISC CPUs have all their opcodes the same length
- : (usually 32 bits); in fact, while RISC was meant to have a Reduced Instruction
- : Set, this is not that true today. RISC is now mostly characterized by the fact
- : that opcodes all have the same size, which allows massive predecoding of the
- : instruction flow. RISC CPUs can predecode up to 8 instructions in parallel,
- : because they know that every 32 bits there's a new instruction.
-
- Actually the P6 can issue 3 instructions per cycle... it's a very good
- performer in the integer area. Something that chips like the R5000
- do is mul+add in 1 instruction; more common to RISC is move+logical operation
- in 1 instruction.
-
- move.l d0,d1
- add.l d2,d1
-
- takes 32 bits; on a RISC it can look like
-
- add d1,d0,d2
-
- Also 32 bits
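-
- To make the equivalence concrete, here is a toy model of the two
- sequences, assuming the three-operand form means d1 = d0 + d2. The
- register dictionary and function names are invented for this sketch; it
- only models 32-bit wraparound, not flags or timing:

```python
MASK32 = 0xFFFFFFFF  # keep results within 32 bits, like a .l operation


def run_two_operand(regs: dict) -> dict:
    """68k-style: copy first, then add into the destination."""
    r = dict(regs)
    r["d1"] = r["d0"]                       # move.l d0,d1
    r["d1"] = (r["d1"] + r["d2"]) & MASK32  # add.l  d2,d1
    return r


def run_three_operand(regs: dict) -> dict:
    """RISC-style: one instruction with a separate destination."""
    r = dict(regs)
    r["d1"] = (r["d0"] + r["d2"]) & MASK32  # add d1,d0,d2
    return r


if __name__ == "__main__":
    start = {"d0": 5, "d1": 99, "d2": 7}
    print(run_two_operand(start) == run_three_operand(start))
```

- Both leave the same value in d1 and leave d0 and d2 untouched; the
- difference is one instruction to issue instead of two.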
-
- : On CISC, it's not possible, because opcodes are not 32-bit aligned. This means
- : that before decoding instruction i, you must decode instructions 0 to i-1.
-
- That's not really a problem... x86 CPUs nowadays have a RISC core and decode the
- x86 'language'. I heard that maybe 18% of the P6 is actually x86-related;
- the rest is just RISC design.
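-
- The dependency Nicolas describes can be sketched like this. The
- variable-length format below is a made-up toy (the first byte of each
- instruction holds its total length, always at least 1); it is not real
- x86 or 680x0 encoding, just enough to show why instruction i can't be
- located without walking 0 to i-1:

```python
def fixed_offset(i: int, width: int = 4) -> int:
    """Fixed-width code: instruction i starts at a directly computable offset."""
    return i * width


def variable_offset(code: bytes, i: int) -> int:
    """Toy variable-length code: walk every earlier instruction to find i."""
    offset = 0
    for _ in range(i):
        # Toy assumption: the first byte of an instruction is its length.
        offset += code[offset]
    return offset


if __name__ == "__main__":
    # Four toy instructions with lengths 1, 3, 2 and 4 bytes.
    code = bytes([1, 3, 0, 0, 2, 0, 4, 0, 0, 0])
    print([variable_offset(code, i) for i in range(4)])
    print([fixed_offset(i) for i in range(4)])
```

- The fixed-width lookup is a single multiply, so several decoders can
- start at once; the variable-length one is a serial walk, which is the
- work the x86 front end has to do before handing things to its core.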
-
- : This way RISC can also implement powerful branch prediction, which tends to
- : add no overhead whether the branch is taken or not.
- : Such prediction technologies are not usable in CISC; using them would mean
- : adding thousands of transistors that could be used to speed up other
- : instructions.
-
- The P6 seems to show that CISC, with a lot of effort, can perform pretty well.
-
-
- : >
- : >We'll soon get BiCMOS technology for CPUs: 700 MHz, while the RAM will run at
- : >a hugely inferior speed. That day (in 1-2 years), having denser
- : >programs (80x86 = up to 8 instructions every 64 bits / PowerPC = up to 2) and
- : >being able to do more with each instruction (= CISC philosophy)
- : >will outperform RISCs many times over.
-
- : Again, I don't agree. The problem is not the size of the opcode, but the time
- : needed to execute it. Allowing 1-byte opcodes means you won't be able
- : to do pipelining and predecoding of the instruction flow.
- : I don't think any chip firm today would go that way, i.e. using 1-byte
- : opcodes.
-
- It's hard to say what would be the best instruction size/format...
-
- : As for 700 MHz, I don't know if this will be reached in standard computers.
- : 300/400 MHz certainly will, but IMO the next step in speeding up will be
- : having more than 1 CPU in parallel. From this point of view, I think the
- : BeBox is a good first attempt in this direction.
-
- RISC/CISC are going that way, yes... Personally I see the future in
- multi-core CPUs and new programming languages (getting away from the
- linear flow design)
-
- : >Intel is not dumb; they said 3 years ago what I have only now understood.
- : >Time for other people to understand it as well.
- : >
- : Intel is producing mass-market CPUs, not clever CPUs. I'm much more interested
- : in work and advice from HP, MIPS, ...
-
-
- Intel also designs advanced RISC chips that even SGI used for high-end geometry
- engines. HP also uses Intel RISC chips in mass quantity. Intel is not stupid and
- has A LOT of resources to take a crap design like the x86 and turn it around
- to be a performer.
-
- : >The only thing that can give RISC a future is ~5 ns RAM, which is unlikely
- : >to happen, indeed.
- : >
- : >So, we are back again.
- : >The AmigaPPC604 (note: expensive high-end model) is not standard yet, while the
- : >200 MHz Pentium Pro is already available and will soon be surpassed by the
- : >new, much faster 80686/80786 "lotsa-Pentiums-into-a-chip" processors.
- : >
-
- : One of the big problems with the x86 is the small number of registers and the
- : way they have to be used. Really, having 32 or 64 regs (PPC) greatly helps
- : speed up execution (as an ASM supporter, I think you will agree on the
- : importance of the number of registers).
-
- I think the x86 uses 'stack' registers... I don't think the MIPS CPUs support
- indexed stores, but an x86 can use this in place of a register:
-
- add virtual_register1,register1
-
- On the 040 this
-
- add.l (virtual_register1,a7),d0
-
- is not slower than
-
- add.l d1,d0
-
-
- : One proof of this is that a 68060/66 runs at the same speed as a P133
- : (this has been shown with some LightWave rendering, taking approx. 12-13 min
- : on both machines).
-
- I think this also shows that the 680x0 has been optimized over a few years.
- Also the x86 is not known for its floating-point performance.
-
- : For me, the future will be having many RISC CPUs in parallel. This is already
- : done in SGI renderers, and I doubt they would do it if it wasn't worth it.
-
- You can go to www.mips.com (MIPS is owned by SGI, like Cray) and check
- out their R5000 (would be nice for a high-end Amiga; the low end could
- use the R4600).
-
-
- : From what I read in your previous post, you seem to be asm-addicted and not
- : really pro-C. In fact, I was like you a few years ago; I swore only by ASM;
- : I wanted to rewrite everything in asm, patch slow functions in the OS, etc...
- : But I changed my mind; I'm now also working on HP-UX servers, which run C code
- : quite fast. A general rule in computing says that 90% of the CPU time is spent
- : in only 10% of the whole code of a program. In that case, writing a portable
- : OS is possible (you could then make specific asm speed-ups on some platforms).
-
- 100% asm is silly in 99% of designs (I love estimates :), but in rare cases
- like doing a rendering engine you might want to stay 100% asm, just so you
- don't have to handle a few scattered C files and a C compiler/interfacing :)
-
- So far asm has been simple, but CPUs are getting more complex now and compilers
- are bound to do a better job overall. Only a few experts will be left
- optimizing really time-critical tight loops :)
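-
- One way to put numbers on the 90/10 rule quoted above is Amdahl's law,
- which neither of us named, so treat this as my own framing. It assumes
- the hot 10% of the code accounts for 90% of the run time and that hand
- asm speeds up only that part by some factor; the function and parameter
- names are illustrative:

```python
def overall_speedup(hot_fraction: float, hot_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when only the hot part gets faster.

    hot_fraction: share of total run time spent in the optimized code (0..1).
    hot_speedup:  how many times faster that code becomes.
    """
    remaining = (1.0 - hot_fraction) + hot_fraction / hot_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    for factor in (2.0, 4.0):
        total = overall_speedup(0.9, factor)
        print(f"hot loops {factor:.0f}x faster: program {total:.2f}x faster")
```

- Under those assumptions most of the gain comes from the hot loops, and
- no amount of asm there can push the whole program past 10x, which is
- the case for leaving the other 90% of the code in C.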
-
- : I still enjoy coding in asm, but I'm also open to other languages.
- : I agree with you on the point that the next Amiga shouldn't only have some kind
- : of SVGA card. For me, a dream machine would be a mix of a fast PPC604
- : and something like the PSX or Saturn video hardware.
-
- I consider 680x0 a fun, easy language :) x86 a pain in the brain...
- PSX/Saturn 3D pales compared to the U64. The PSX uses a 33 MHz R3000,
- the U64 a 100 MHz R4000... The U64 video HW with a 1280x1024 res would
- be very nice, especially because it would come from a mass-produced
- chipset, so cheap. The U64 R4000 100 MHz could be the geometry
- engine sitting next to the R5000 (400 MFLOPS for <$300) main CPU.
-
- Stephan
-
-